Discrimination-aware Network Pruning for Deep Model Compression

نویسندگان

چکیده

برای دانلود باید عضویت طلایی داشته باشید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Automated Pruning for Deep Neural Network Compression

In this work we present a method to improve the pruning step of the current state-of-the-art methodology to compress neural networks. The novelty of the proposed pruning technique is in its differentiability, which allows pruning to be performed during the backpropagation phase of the network training. This enables an end-to-end learning and strongly reduces the training time. The technique is ...

متن کامل

A Deep Neural Network Compression Pipeline: Pruning, Quantization, Huffman Encoding

Neural networks are both computationally intensive and memory intensive, making them difficult to deploy on embedded systems with limited hardware resources. To address this limitation, We introduce a three stage pipeline: pruning, quantization and Huffman encoding, that work together to reduce the storage requirement of neural networks by 35× to 49× without affecting their accuracy. Our method...

متن کامل

Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding

Deep Compression is a three stage compression pipeline: pruning, quantization and Huffman coding. Pruning reduces the number of weights by 10x, quantization further improves the compression rate between 27x and 31x. Huffman coding gives more compression: between 35x and 49x. The compression rate already included the metadata for sparse representation. Deep Compression doesn’t incur loss of accu...

متن کامل

Compression-aware Training of Deep Networks

In recent years, great progress has been made in a variety of application domains thanks to the development of increasingly deeper neural networks. Unfortunately, the huge number of units of these networks makes them expensive both computationally and memory-wise. To overcome this, exploiting the fact that deep networks are over-parametrized, several compression strategies have been proposed. T...

متن کامل

Universal Deep Neural Network Compression

Compression of deep neural networks (DNNs) for memoryand computation-efficient compact feature representations becomes a critical problem particularly for deployment of DNNs on resource-limited platforms. In this paper, we investigate lossy compression of DNNs by weight quantization and lossless source coding for memory-efficient inference. Whereas the previous work addressed non-universal scal...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: IEEE Transactions on Pattern Analysis and Machine Intelligence

سال: 2021

ISSN: 0162-8828,2160-9292,1939-3539

DOI: 10.1109/tpami.2021.3066410